
    A Moment in the Sun: Solar Nowcasting from Multispectral Satellite Data using Self-Supervised Learning

    ABSTRACT Solar energy is now the cheapest form of electricity in history. Unfortunately, significantly increasing the electric grid’s fraction of solar energy remains challenging due to its variability, which makes balancing electricity’s supply and demand more difficult. While thermal generators’ ramp rate (the maximum rate at which they can change their energy generation) is finite, solar energy’s ramp rate is essentially infinite. Thus, accurate near-term solar forecasting, or nowcasting, is important to provide advance warnings to adjust thermal generator output in response to variations in solar generation to ensure a balanced supply and demand. To address the problem, this paper develops a general model for solar nowcasting from abundant and readily available multispectral satellite data using self-supervised learning. Specifically, we develop deep auto-regressive models using convolutional neural networks (CNN) and long short-term memory networks (LSTM) that are globally trained across multiple locations to predict raw future observations of the spatio-temporal spectral data collected by the recently launched GOES-R series of satellites. Our model estimates a location’s near-term future solar irradiance based on satellite observations, which we feed to a regression model trained on smaller site-specific solar data to provide near-term solar photovoltaic (PV) forecasts that account for site-specific characteristics. We evaluate our approach for different coverage areas and forecast horizons across 25 solar sites and show that it yields errors close to that of a model using ground-truth observations.
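The two-stage structure of this pipeline (a globally trained auto-regressive forecaster feeding a site-specific regression) can be sketched in miniature. This is a hedged illustration only: it stands in a simple least-squares AR model for the paper's CNN/LSTM over multispectral imagery, and a 1-D linear fit for the site-specific PV regression; all function names and the synthetic data are invented for the example.

```python
import numpy as np

def fit_ar(series, order=3):
    """Fit AR(order) coefficients by least squares -- a stand-in for the
    paper's deep auto-regressive model over satellite observations."""
    X = np.stack([series[i:len(series) - order + i] for i in range(order)], axis=1)
    y = series[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_next(series, coef):
    """Stage 1: nowcast the next irradiance value from recent observations."""
    order = len(coef)
    return float(series[-order:] @ coef)

def fit_site_regression(irradiance, pv_output):
    """Stage 2: map irradiance to PV output with a site-specific linear fit
    (the paper trains a regression model on smaller site-specific data)."""
    A = np.stack([irradiance, np.ones_like(irradiance)], axis=1)
    (slope, intercept), *_ = np.linalg.lstsq(A, pv_output, rcond=None)
    return slope, intercept

# Synthetic demo data: a smooth irradiance series and PV output that is a
# fixed linear function of irradiance (purely illustrative numbers).
t = np.arange(100, dtype=float)
irr = 500 + 100 * np.sin(t / 10)
pv = 0.8 * irr + 5.0

coef = fit_ar(irr, order=3)
irr_hat = predict_next(irr, coef)            # stage 1: irradiance nowcast
slope, intercept = fit_site_regression(irr, pv)
pv_hat = slope * irr_hat + intercept         # stage 2: site-specific PV forecast
```

The key design point mirrored here is the split: stage 1 can be trained once, globally, on abundant satellite data, while stage 2 needs only a small amount of per-site PV data to capture local characteristics.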

    Simultaneously Linking Entities and Extracting Relations from Biomedical Text Without Mention-level Supervision

    Understanding the meaning of text often involves reasoning about entities and their relationships. This requires identifying textual mentions of entities, linking them to a canonical concept, and discerning their relationships. These tasks are nearly always viewed as separate components within a pipeline, each requiring a distinct model and training data. While relation extraction can often be trained with readily available weak or distant supervision, entity linkers typically require expensive mention-level supervision -- which is not available in many domains. Instead, we propose a model which is trained to simultaneously produce entity linking and relation decisions while requiring no mention-level annotations. This approach avoids cascading errors that arise from pipelined methods and more accurately predicts entity relationships from text. We show that our model outperforms a state-of-the-art entity linking and relation extraction pipeline on two biomedical datasets and can drastically improve the overall recall of the system. Comment: Accepted in AAAI 202
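The difference between pipelined and joint decisions described above can be shown with a toy joint decoder. Everything here is illustrative: the candidate concepts, scores, and the additive scoring rule are made up for the example and are not the paper's model; the point is only that link and relation choices are scored together rather than sequentially.

```python
# Candidate concept links per mention, with hypothetical link scores.
candidates = {
    "aspirin": [("CHEBI:15365", 0.9), ("MESH:D001241", 0.7)],
    "headache": [("MESH:D006261", 0.8), ("HP:0002315", 0.6)],
}

# Hypothetical relation scores between concept pairs, e.g. obtained from
# distant supervision against a knowledge base (no mention-level labels).
relation_scores = {
    ("CHEBI:15365", "MESH:D006261", "treats"): 0.95,
    ("MESH:D001241", "HP:0002315", "treats"): 0.10,
}

def joint_decode(head, tail, relations=("treats",)):
    """Pick the (head_id, tail_id, relation) triple maximizing the combined
    score of both links and the relation -- jointly, not as a pipeline."""
    best, best_score = None, float("-inf")
    for h_id, h_score in candidates[head]:
        for t_id, t_score in candidates[tail]:
            for rel in relations:
                score = h_score + t_score + relation_scores.get((h_id, t_id, rel), 0.0)
                if score > best_score:
                    best, best_score = (h_id, t_id, rel), score
    return best, best_score
```

In a pipeline, a wrong top-1 link for either mention would be fixed before relation extraction ever runs; scoring jointly lets relation evidence rescue a lower-ranked link, which is the cascading-error problem the abstract refers to.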

    Relating Romanized Comments to News Articles by Inferring Multi-Glyphic Topical Correspondence

    Commenting is a popular facility provided by news sites, and analyzing such user-generated content has recently attracted research interest. However, in multilingual societies such as India, this analysis is hard for several reasons: (1) There are more than 20 official languages, but linguistic resources are available mainly for Hindi. People frequently write romanized text, as it is easy and quick with an English keyboard, resulting in multi-glyphic comments, where the texts are in the same language but in different scripts. Such romanized texts are almost unexplored in machine learning so far. (2) In many cases, comments address a specific part of the article rather than the topic of the entire article. Off-the-shelf methods such as correspondence LDA are insufficient to model such relationships between articles and comments. In this paper, we extend the notion of correspondence to model multi-lingual, multi-script, and inter-lingual topics in a unified probabilistic model called the Multi-glyphic Correspondence Topic Model (MCTM). Using several metrics, we verify our approach and show that it improves over the state-of-the-art.
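The "correspondence" idea that MCTM builds on can be illustrated with the generative step of correspondence-LDA-style models: a comment word's topic is drawn from the topics actually used in the article it responds to, tying comments to specific parts of the article. The sketch below is a toy simulation with invented topics and vocabularies, not the MCTM inference code, and it omits the multi-script machinery entirely.

```python
import random

random.seed(0)

article_topic_dist = [0.7, 0.2, 0.1]   # hypothetical topic mixture of one article
topic_words = {                        # hypothetical per-topic vocabularies
    0: ["election", "vote", "party"],
    1: ["cricket", "match", "score"],
    2: ["film", "song", "actor"],
}

# Draw a topic for each article word from the article's topic mixture.
article_topics = random.choices([0, 1, 2], weights=article_topic_dist, k=20)

# Correspondence step: each comment word picks its topic uniformly from the
# topic assignments that occurred in the article, then emits a word from
# that topic's distribution. Comments thus inherit the article's topics.
comment = []
for _ in range(5):
    z = random.choice(article_topics)
    comment.append(random.choice(topic_words[z]))
```

Because comment topics are restricted to those appearing in the article, a comment naturally attaches to whichever part of the article its topic came from; MCTM extends this coupling across scripts and languages.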